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Some  Variations  on  Total  Variation— Based  Image  Smoothing 

Antonin  Chambolle/  Stacey  E.  Levine,^  and  Bradley  J.  Lucier^ 


Abstract 


satisfies 


In  this  paper  we  study  finite-difference  approximations  to  the 
variational  problem  using  the  BV  smoothness  penalty  that  was  intro¬ 
duced  in  an  image  smoothing  context  by  Rudin,  Osher,  and  Fatemi. 
We  give  a  dual  formulation  for  an  “upwind”  finite-difference  approx¬ 
imation  for  the  BV  seminorm;  this  formulation  is  in  the  same  spirit 
as  one  popularized  by  Chambolle  for  a  simpler,  more  anisotropic, 
finite-difference  approximation  to  the  BV  seminorm.  We  introduce 
a  multiscale  method  for  speeding  the  approximation  of  both  Cham- 
bolle’s  original  method  and  of  the  new  formulation  of  the  upwind 
scheme.  We  demonstrate  numerically  that  the  multiscale  method  is 
effective,  and  we  provide  numerical  examples  that  illustrate  both  the 
qualitative  and  quantitative  behavior  of  the  solutions  of  the  numeri¬ 
cal  formulations. 


(3)  ^^—^  +  dx\f\BY(I)^Q, 

where  dx^{g)  is  the  subdifferential  of  the  convex,  lower- 
semi-continuous  map  (p:  L2{I)  — >  M.  (See  [17]  for  defini¬ 
tions  and  the  basic  results  we  quote  here.)  Furthermore, 
if  we  set 

ll/-/llL(/)  =  ^^ 

then  cr^  is  a  continuous,  one-to-one,  increasing  function  of 
A  and 


1.  Introduction 


In  an  influential  paper,  Rudin,  Osher,  and  Fatemi 
[23]  suggested  using  the  bounded  variation  seminorm  to 
smooth  images.  The  functional  proposed  in  their  work  has 
since  found  use  in  a  wide  array  of  problems  (see,  e.g.,  [7]), 
both  in  image  processing  and  other  applications.  The  nov¬ 
elty  of  the  work  was  to  introduce  a  method  that  preserves 
discontinuities  while  removing  noise  and  other  artifacts. 
We  begin  by  giving  some  background  on  that  work. 

We  work  on  the  unit  square  I  =  [0,1]^,  where  the 
bounded  variation  seminorm  is  defined  as 


I/Ibv(/)  :=  \Df{x)\dx 


(1) 


:=  supl  /  /  V  -  p 


p:  I 


p  G  C'^(/),  \p{x)\  <  1  for  all  a;  G  /| 


The  formulation  of  BV(/)  smoothing  on  which  we  depend 
is  as  follows:  Given  a  nonconstant  function  /  and  a  number 
A  >  0,  find  the  function  /  that  minimizes  over  all  g 

(2)  ^Wf  -  9\\'l^{i)+ M9\by{i)- 

Fix  /;  for  each  A  this  problem  has  a  unique  solution  that 
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(4) 


Thus,  given  that  satisfies  (4)  there  is  a  unique  A  such 
that  the  solution  /  of  (2)  given  above  is  also  the  solution 
of  the  problem:  Find  /  that  minimizes  over  all  g  with 


11/  ~ 


the  functional 

lff|BV(7)- 

Rudin,  Osher,  and  Fatemi  introduced  BV(/)  image 
smoothing  in  the  latter  form,  but  these  two  formulations 
are  equivalent  [11]. 

More  recently,  the  BV  smoothing  technique  is  used  as 
just  one  step  in  the  so-called  inverse  scale  space  approach 
[20,  8]. 

In  the  continuous  setting,  the  behavior  of  the  solutions 
of  (2)  is  well  understood  (see  [22]  and  references  there-in). 
The  qualitative  properties  of  solutions  of  discrete  versions 
of  (2)  are  not,  perhaps,  as  well  known  or  understood.  In 
this  work  we  study  the  behavior  of  solutions  of  the  discrete 
approach  featured  in  [9]  as  well  as  an  “upwind”  variant 
of  this  model  that  better  preserves  edges  and  “isotropic” 
features. 

The  paper  is  organized  as  follows.  In  Section  2,  we 
discuss  the  algorithm  introduced  in  [9]  for  minimizing  a 
finite-difference  approximation  to  the  ROF  functional.  In 
Section  3,  we  propose  a  new  formulation  of  an  upwind 
finite-difference  approximation  to  the  bounded  variation 
seminorm,  and  show  how  this  can  be  used  to  minimize  the 
ROF  functional  in  a  manner  similar  to  [9] .  In  Section  4,  we 
introduce  a  multiscale  algorithm  that  greatly  reduces  the 
run  time  of  both  methods.  Section  5  contains  numerical 
examples  that  demonstrate  the  qualitative  properties  of 
the  algorithm  and  observed  rates  of  convergence  for  two 
special  problems  with  known  solutions. 


2.  Discrete  BV(/)  Variational  Smoothing. 

To  begin  we  consider  discretizations  of  (2).  A  stan¬ 
dard  approach,  as  first  suggested  in  [23],  is  to  regularize 
the  BV  seminorm  and  consider  the  problem  of  minimizing, 
with  e  >  0, 

2 11/ ~  5lli2(7)  +  ^  J 

This  functional  is  differentiable  in  g,  and  one  can  follow 
the  flow  of  its  associated  Euler-Lagrange  equation;  numer¬ 
ical  methods  approximate  this  flow.  This  is  sufficient  for 
some  applications,  but  the  solution  now  depends  on  the 
regularization  parameter  e. 

In  this  work  we  consider  a  discrete  analog  of  (2)  and 
follow  the  dual  approach  of  in  [9].  The  material  in  this 
section  is  classical;  we  refer  the  reader  to  [9]  for  a  more 
extended  treatment,  and  to  [13]  which  puts  [9]  into  some 
historical  context.  Another  approach  to  solving  the  dis¬ 
crete  problem  that  arises  here  is  through  second-order  cone 
programming  [18],  and  through  graph  cuts  [14],  where  one 
considers  anisotropic  approximations  to  the  usual  isotropic 
BV  seminorm. 

Given  V  >  1,  we  let  ft-  =  1/V  and  consider  discrete 
functions 


fi,  i  =  (ii,Z2),  0  <  ti,Z2  <  N. 

A  discrete  L^il)  norm  of  /  is  defined  by 

II/IIl5(/)  “  ^  l/»l  ^  ■ 

0<ii,z2<A^ 

One  way  to  compute  a  discrete  gradient  of  a  discrete  scalar 
function  fi  is  given  by 

/r-,  Yy  r  _  /  /l+(l,0)  fi  /l+(0,l)  fi\ 

(5)  ^  )■ 

For  any  discrete  gradient  operator,  one  needs  to  spec¬ 
ify  the  value  of  fi  for  some  values  of  i  outside  [0,iV)^;  to 
do  so,  we  need  to  specify  boundary  conditions.  We  as¬ 
sume  that  scalar  discrete  functions  g,  /,  etc.,  are  either 
periodic  (with  period  N)  or  satisfy  Neumann  or  Dirichlet 
boundary  conditions.  For  Dirichlet  boundary  conditions, 
fi  is  zero  for  i  outside  [0,  Nf^-,  for  Neumann  conditions,  we 
consider  /  to  be  reflected  across  the  lines  ii  =  V  —  1/2 
and  12  =  N  —  1/2  and  then  extended  periodically  across 
the  plane  with  period  2N. 

Given  a  discrete  gradient,  we  can  define  an  associated 
discrete  BV  seminorm 

I/Ibv'‘(/)  =  ^  |Vft/i|ft^ 

and  then  in  turn  a  discrete  analogue  to  (2), 

(6)  2^\f  -  9\\'l;^(^i)  + 

For  any  discrete  gradient,  we  define  an  associated  dis¬ 
crete  divergence  of  vector  functions  pi  =  given 


the  discrete  gradient  (5),  we  define  the  associated  diver¬ 
gence  by 


(7) 


-  Pi 


.(1) 


Pi-{1,0) 

ft 


„(2) 


ft 


As  for  boundary  conditions,  we  note  that  in  the  following 
we  compute  discrete  divergences  only  of  discrete  vector 
fields  that  are  themselves  discrete  gradients;  therefore  we 
compute  Pi  for  i  outside  of  [0,iV)^  in  a  manner  consistent 
with  whatever  boundary  condition  we  have  chosen  for  dis¬ 
crete  scalar  functions  fi. 

Because  of  how  the  discrete  gradient  and  divergence 
are  related,  we  have 


I/IbV'*(7)  —  X/  \'^hfi\h'^ 

0<ii,i2<N 

('fi'l  =  sup  V  (-V/i)/*-Kft^ 

=  sup  ft{yh-Pt)h'^- 

|Pd<l  0<ii,i2<N 

The  first  equality  is  obvious;  the  second  (which  can  be 
interpreted  as  “the  adjoint  of  the  discrete  divergence  is  the 
negative  of  the  discrete  gradient”)  follows  by  summation 
by  parts. 

Thus,  if  the  symmetric  convex  set  K  is  defined  by 

K=  {gi  =  Vh-Pi  \  \Pt\  <  1,  P*  = 

then 

I/Ibv'‘(7)  =  sup^/ig*ft^  =:  {f,g). 

g&K 


Using  classical  convexity  arguments,  the  first  author  [9] 
showed  that  the  minimizer  /  of  (6)  is  /  —  W h  •  P  with  p  a 
minimizer  of 


(9) 


F{p)  := 


-p 


f_ 

A 


2 


subject  to  the  constraint 


(10)  p  €  K  :=  {p:  [0,  V)^  — >  |  \pi\  <  1  for  all  i}. 


In  other  words, 

(11)  /  =  /-^A7f/> 

where  is  the  orthogonal  projector  in  Li/il)  of  /  onto 
the  convex  set  XK. 

Ghambolle  [9]  gave  a  specific  iterative  algorithm  for 
finding  a  discrete  vector  field  p  that  minimizes  (11).  He 
first  set  p/  =  0  for  all  i  (so  that  p°  is  obviously  in  K)  and 
then  calculates 

.,21  n+l  Pr-T(-Vl.)(Vl.-p"-//A). 

^  ^  ■  l  +  r|(-V;,)(Vz,-p"-//A).|- 

He  notes  that  p"^  G  K  for  all  n  and  shows,  using  the 
Karush-Kuhn- Tucker  theorem,  that  if  0  <  r  <  ft^/8  the 
limit  lim„^oo  p"  exists  and  gives  a  minimizer  of  (11)  over 


2 


all  p  with  \pi\  <  1.  Procedure  (12)  can  be  written  as  the 
two-step  process 


P”  -  t(-V,)(V,  •  p”  - //A)„ 


'1-I-1/2 


1  +  \P\ 


t+1/2 


where  the  first  formula  is  just  gradient  descent  of  the  func¬ 
tional  F(j))  with  step  r,  and  the  second  is  a  nonlinear  pro¬ 
jector  that  ensures  that  G  K  if  p^  G  K. 

In  [10],  Chambolle  speculates  whether  the  two-step 
procedure 


P^ 


(13) 


P^ 


,n+l 


P 


1-I-1/2 


/I  ^^1/2  \ 

max(l,  p-  ) 


yields  vector  fields  p”  such  that  Vh  •  p"'  converges  to  the 
projection  TT^{f/\).  Again,  the  first  half-step  is  gradient 
descent  along  the  functional  F{p)  given  in  (23),  while  the 
second  is  the  L^-projection  of  p  onto  the  set  K.  We  note 
at  the  end  of  the  next  section  that  this  iteration  is  a  special 
case  of  a  more  general  minimization  algorithm  developed 
originally  by  Eicke  [16],  which  itself  has  been  generalized 
by  Combettes  [12,  13].  There  is  recent  work  related  to 
algorithms  of  this  type  by  Aujol  [4]. 


3.  Upwind  BV  Smoothing 


One  might  think  that,  given  g  G  BV(/),  the  L2{I) 
projection  of  p  on  a  grid  with  sidelength  h 

(14)  p,  =  I^  =  h{I  +  i) 

or  the  (multi-valued)  Li{I)  projection  of  p  on  the  same 
grid 

Pi  =  any  m  such  that  \{x  G  U  \  g{x)  >  to}|  >1/2 
and  \{x  G  li  1  g{x)  <  m\\  >1/2 

would  satisfy 

(15)  |5|bv'‘(/)  =  |5|bv(/), 

but  this  is  not  true  in  general.  If  p  is  then  integration 
by  parts  in  (1)  shows  that  (15)  holds.  Again,  if  p  is  the 
characteristic  function  of  the  set  {xi  <  1/2}  or  {x2  < 
1/2},  or  even  {xi  +  X2  <  1},  then  (15)  holds,  but  if  p  is 
the  characteristic  function  of  {xi  <  X2}  and  we  use  the 
projection  (14)  then  a  calculation  shows  that 

(16)  2  =  Im  1p1bv'‘(/)  lfflBV(/)  =  . 

In  fact,  this  is  not  so  much  of  an  issue  as  far  as  min¬ 
imization  problems  are  concerned,  since  it  is  well  known 
in  this  case  that  the  correct  notion  of  convergence  is  T- 
convergence  [6],  which  ensures  convergence  of  the  mini- 
mizers  of  variational  problems.  It  can  be  shown  without 
much  difficulty  that  the  semi-norms  [-[syfe  T-converge  as 
— >  0  to  the  BV  seminorm.  However,  it  follows  from  the 


inequality  (16)  that  the  approximation  of  X{xi<x2}j  a 
possible  solution  of  a  minimization  problem,  will  be  possi¬ 
ble  only  after  some  smoothing  of  the  discontinuity,  so  that 
the  output  of  a  discrete  minimization  will  usually  not  be 
as  sharp  as  one  could  hope. 

This  issue  motivates  us  to  we  define  an  “upwind”  dis¬ 
crete  BV(/)  norm  of  a  discrete  scalar  function  p^  given  by 

(17)  bllBvV/)  ^  |(-V/,)p,  V  0| 


where  we  have  defined  the  discrete  gradient 


(18) 


(-V/j)pi  = 


^  9i  —  9i+(lfl)  ^ 
h 

9i  —  9i-(ip) 
h 

9i  ~  9i+{0,l) 
h 

Pi  —  Pi-(0,1) 
h 


and  we  denote  by  p  V  p  and  p/\q  the  componentwise  max¬ 
imum  and  minimum,  respectively,  of  the  vectors  p  and  p. 
(Similarly,  if  we  write  an  inequality  between  vectors,  p  <  q, 
then  we  mean  that  this  inequality  holds  componentwise.) 
This  type  of  operator  is  based  on  the  classical  first-order 
upwind  finite-difference  scheme  used  to  solve  hyperbolic 
partial  differential  equations;  upwind  methods  have  found 
important  applications  in  level  set  methods  [21]. 

Of  course,  in  the  present  case,  the  problem  we  are 
solving  is  degenerate  elliptic  and  there  is,  strictly  speak¬ 
ing,  no  direction  of  “wind”  so  that  our  choice  may  be  seen 
as  quite  arbitrary  (and,  in  fact,  the  reverse  direction  is  also 
an  admissible  choice).  However  it  still  produces  the  desired 
effect,  which  is  to  preserve  some  discontinuities  better  than 
standard  discretizations.  We  may  refer,  for  a  similar  idea, 
to  Appleton  and  Talbot  [3]  who  recently  proposed  to  com¬ 
pute  minimal  surfaces  by  solving  some  hyperbolic  system, 
discretized  with  an  upwind  scheme.  Also  in  their  case  the 
“speed”  and  “wind”  can  be  reversed,  still,  their  approach 
produces  sharp  discontinuities  as  desired  —  on  the  other 
hand,  it  does  not  really  correspond  to  the  minimization  of 
a  convex  discrete  functional  such  as  our  upwind  TV. 

In  (17)  we  include  a  difference  in  the  sum  only  if  the 
difference  is  positive,  i.e.,  the  discrete  function  gi  is  in¬ 
creasing  as  it  goes  to  gi  from  the  given  direction.  Note 
that  for  smooth  g{x)  this  is  a  convergent  approximation 
to  the  BV  seminorm  of  /  and  for  jumps  across  vertical, 
horizontal,  or  diagonal  lines  you  get  the  correct  value  of 
the  BV  semi-norm;  that  is,  lim;,^o  |5|bv'*(/)  =  |5|bv(/)- 

We  then  write  this  “upwind”  semi-norm  as 


'^\{-^h)9^'^  =  sup  y^(-Vii)gi  -  Pi  h^- 

i  |Pi|<l,  Pi>0  i 


Here  we  require  not  only  that  the  Euclidean  norm  \pi\  of 
Pi  G  M"'  be  no  larger  than  one,  but  also  that  each  coordi¬ 
nate  of  Pi  be  non-negative,  so  that  p  is  in  the  set 

(19)  K  :=  {p-.  [0,  NY  ^  ||Pi|  <  1  and  pi  >  0}. 
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Thus  we  have  dealt  with  the  “extra”  nonlinearity  of  (17) 
by  incorporating  it  into  the  convex  set  that  contains  p. 

If  we  now  define  the  discrete  divergence  that  is  the 
adjoint  of  the  discrete  gradient  (18), 

c(l)_y(l)  f(2)__e(2) 

<»)  ,(.)  )(»  )(4, 

Si  Si_(o,l)  Si  Si+(o,l) 

h  h 

and  again  apply  summation  by  parts,  we  see  that  this  new 
discrete  semi-norm  is  equal  to 

sup  ^  giUi  =  sup (5,  u) 

U  .  U 

I 


with  u  in  the  convex  set 

(21)  K:={Vh-p\p--[0,Nf^R\  \p^\ <  1,  Pi  >  o}. 


Thus,  as  in  [9],  the  minimizer  over  discrete  scalar 
functions  gi  of 

(22)  ^ll/-5f +  A|||(-V,)gV0|||i 


can  be  written  as  the  difference  between  /  and  the  unique 
projection  of  /  onto  the  convex  set  XK. 

In  other  words,  the  minimizer  of  (22)  is  /  —  AV/j  •  p 
where  p  is  any  minimizer  of  the  functional 


(23) 


F{p)  := 


Vft  -p 


/  2 

A 


subject  to  the  constraint  that  p  G  K,  where  K  is  defined  by 
(19).  An  iterative  method  to  compute  a  p  that  minimizes 
(23)  is  given  by 


(24) 


P^ 

n 

Pi 


■—Pi  ‘  \  h 

1+2/3  n+1/3  .  ,  ^ 

'  ■■=Pi  'VO, 

„n-|-2/3 

:=  _ ^ _ 

*  max(l,  p"+2/3  ^  ’ 


where  p^  is  chosen  arbitrarily  in  K .  The  computation  of 
^n-i-i/3  .g  gjj^p^y  gradient  descent  of  (23),  while  the  next 
two  steps  compute  the  projection  of  onto  K. 

We  remark  that  [24]  contains  bounds  for  the  differ¬ 
ence  in  7^2(7)  between  discrete  minimizers  of  (22)  and  the 
minimizer  of  (2)  as  the  mesh  size  h  0. 

The  notational  similarity  of  (23)  and  (11)  is  deliberate; 
both  can  be  formulated  as  minimizing  over  all  a;  in  a  closed 
convex  set  K 


(25)  F{x)  =  \\Ax-b\\^ 

for  some  bounded  linear  operator  A  and  vector  b.  In  our 
cases  we  have  x  =  p,  Ax  =  X/ i^-p,  b  =  //A,  and  K  is  given 
by  (10)  or  (19). 

We  consider  the  general  iteration 

(26)  =7rx(a;"-T(A*(Ax”-6))), 


where  ttk  is  the  orthogonal  projection  onto  the  set  K,  A* 
is  the  adjoint  of  the  operator  A,  and  r  is  suitably  small. 
In  other  words,  we  first  perform  gradient  descent  on  the 
functional  F{x)  and  then  project  the  intermediate  result 
onto  the  convex  set  K . 

The  convergence  of  this  algorithm  was  studied  by 
Eicke  [16],  Theorem  3.2,  and  is  a  special  case  of  a  gen¬ 
eral  theory  developed  later  by  Combettes  and  his  collab¬ 
orators  [12,  13].  For  pedagogical  purposes  we  recommend 
the  analysis  in  [16],  which  is  particularly  short  and  self 
contained.  The  result  applied  to  our  (finite-dimensional) 
problem  gives  the  following:  if  0  <  t  <  2||A||“^  then  x” 
converges  to  a  minimizer  of  F{x)  on  K.  (Part  of  the 
result  goes  back  to  Opial  [19].)  In  our  case  this  means 
that  the  method  converges  if  0  <  r  <  h? for  the  dis¬ 
crete  divergence  (7)  [9];  a  similar  argument  shows  that  for 
the  discrete  divergence  (20)  we  obtain  convergence  when 
0  <  T  <  h? /Id). 

The  iteration  (26)  is  efficient  only  if  ttj^.  A,  and  A* 
can  be  calculated  quickly.  In  our  case,  it  takes  0{N‘^) 
operations  to  calculate  V„  -p  or  —  V^/  on  an  fV  x  TV  image. 
For  the  set  K  defined  by  (10),  we  have  simply 

{t^kp)i  =  - 

max(l,  |p,|) 


For  K  defined  by  (19)  we  set 


(7ricp)i 


P^ 

max(l,  |pi|) 


where  Pi 


Pi  V  0. 


So  with  either  (10)  or  (19)  we  can  calculate  ttkP  on  an 
N  X  N  image  in  O(TV^)  operations. 


4.  A  Multiscale  Algorithm 


The  characterization  of  the  minimizer  (11)  allows  us 
the  following  observation:  We  need  only  construct  a  vector 
field  p  that  minimizes  F(p)  over  all  p  G  K.  Any  general 
iteration  of  the  form  (26)  converges  as  long  as  the  initial 
data  is  in  the  set  K.  We  propose  here  to  use  a  multiscale 
technique  to  get  a  good  approximation  p^  G  K  for  our 
iterations. 

We  consider  two  grids  in  7,  one  with  grid  spacing  2h 
and  one  with  grid  spacing  h.  We  will  construct  a  scalar 
injector  from  the  2/i-grid  to  the  /i-grid  (called  and  a 
scalar  projector  from  the  h  grid  to  the  2h  grid  (called  7^^). 
Similarly,  we  will  have  an  operator-dependent  injector  7^^ 
on  vector  fields.  Our  general  approach  will  then  be  as 
follows. 

Given  data  fh  on  a  grid  with  spacing  h,  we  calculate 

data 

f2h  =  Ih^  fh 

on  a  grid  with  spacing  2h.  We  then  calculate  the  minimizer 
P2h  of  (23)  with  data  f2h  using  our  iterative  algorithm  (not 
yet  specifying  the  initial  value  p°).  Next,  we  begin  the 
iteration  solving  (23)  with  data  fh  with  the  initial  vector 
field 

=  a^hP-ih- 
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We  now  explain  our  choice  of  and  Assume 
that  N  is  even  with  h  =  1/N,  and  define  the  discrete  inner 
product  on  N  X  N  arrays 

{u,v)h  =  UjV^ 

i 

where  the  sum  is  taken  over  all  i  =  (zi,  12)  with  0  <  Zi,  12  < 
N.  A  similar  inner  product  can  be  defined  on  N/2  x  N/2 
arrays  with  grid  spacing  2h: 

{U,v)2h  =  (2/1)^, 

i 

where  the  sum  now  is  over  all  i  =  (11,12)  with  0  <  zi,  12  < 
N/2. 

Our  injector  will  simply  be  the  constant  injector 
on  2  X  2  squares: 

~  ^b/2j  5 

where  [z/2j  =  ([zi/2j ,  [z2/2j )  and  [x\  is  the  largest  integer 
no  greater  than  x.  The  corresponding  projector  is 
defined  as  the  adjoint  of  with  respect  to  the  h-  and 
2ft,-inner  products,  i.e.,  for  an  iV  x  iV  grid  function  v  and 
a  N/2  X  N/2  grid  function  u 

{lLu,v)h  =  {u,ll^v)2h- 


A  direct  calculation  shows  that 

+  l’2i+(l,0)  +  ^'2i+(0.1)  +  ^2i+(l,l)), 

i.e.,  it  is  simply  the  average  of  the  values  of  u  on  2  x  2 
subgrids.  (The  factor  1/4  comes  in  because  of  the  different 
weights  in  the  two  inner  products.) 

After  we  calculate  the  minimizer  p2h  of  (23)  over  all 
p  G  K  on  the  grid  with  spacing  2h,  we  start  the  iteration 
on  the  grid  with  spacing  h  with 

=  ^hPih- 
— s 

Here  is  an  injector  that  satisfies 
(27)  h  •  ^hP^h  =  l2h^2h  ■  P2h- 


Specifically,  for  the  anisotropic  operators  (7)  and  (5)  we 
have 

.(1)  ^  Ji) 


{^hP)i  — 


1  P 


Li/2J 
(2) 

.Pp/2J 


■7'Lb-(i.o))/2j 

(2) 

■7'Lb-(0,i))/2j 


and  for  the  upwind  operators  (18)  and  (20)  we  have 


(4p)*  = 


(  ^'S2J 
Pvi2\ 

(3) 

Pp/2J 

\7^b/2j 


PLb-(i.0))/2j  ' 

(2) 

■7'Lb+(i,0))/2j 

(3) 

■7'Lb-(0.i))/2j 
■7'Lb+(0,i))/2j  ) 


In  both  cases,  we  know  p2h  minimizes 
\\^2h-p-hh/\f2h 


over  all  p  €  K2h-  Because  (27)  holds  we  know  that  4P2h 
minimizes  over  all  p  G  IShK2h  %  Kh 

\\^h-p-iLf2h/m. 

We  also  know  that 

l2hhh  -  AV?i  •  I^hP2h 

minimizes 

2 114/2/1  -  g\\h  +  AIsIbv'i 

over  all  g  G  4^2/1  2  Kh. 

We  don’t  know  a  reasonable  way  to  bound  the  error 
incurred  by  starting  the  iteration  on  the  grid  with  spacing 
h  with 

P°  =  T^KAP2h- 

There  are  three  sources  of  error:  (1)  we  use  4/2/1  — 
^2h^h^fh  ^s  the  data  instead  of  fh;  (2)  we  minimize  over 
all  p  G  l!/f^K2h  instead  of  over  Kh,  and  (3)  we  immediately 
project  l2hP2h  onto  Kh-  It  is  straightforward  to  bound  the 
first  error,  since  the  solutions  to  all  these  problems  are  L2 
contractions  so  the  difference  between  the  minimizers  of 

2 114/2/1  —  g\\h  +  AIsIbv'i 

and 

2II//1  “  dWh  +  Alfflev'* 

over  all  g  G  is  bounded  by 

II//1  -  l2hhh\\L.2  <  11/  -  //ilUa  +  11/  -  /2/i||l2 

<  C/z“|/|Lip(a.L.) 

whenever  /  is  in  the  Lipschitz  space  Lip(Q!,L2)-  We  don’t 
know  how  to  deal  with  the  other  two  errors. 

5.  Experimental  Results 

We  did  a  series  of  experiments  to  (a)  measure  the  ef¬ 
fectiveness  of  the  multiscale  predictor  for  the  initial  vector 
field  p,  (b)  examine  the  experimental  convergence  rates 
for  two  sets  of  initial  data  for  the  continuous  BV  problem 
with  known  analytic  solutions,  and  (c)  illustrate  some  of 
the  qualitative  properties  of  the  solutions  to  the  discrete 
problems. 

All  of  our  algorithms  compute  only  an  approximate 
minimizer  of  discrete  BV  problems — we  must  decide  when 
to  stop  the  iteration  (26).  In  [10]  one  finds  a  simple  error 
bound;  first  we  let 

r  =  f-\Vh-p^. 

Then  we  have 
(28) 

11/ -  /”|li5(,)  <  A(|/"|bvV/)  -  {^hr,Pn)  =  e(p”)^- 

So  we  can  ensure  that  we’ve  computed  /  to  within  an  error 
of  e,  i.e., 

11/  ~  /”IIl5(7)  ^  e. 
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if  we  iterate  until  e(p”)  <  e.  It  is  possible  that  the  true 
error  is  much  less  than  the  bound,  but  we  don’t  know  that. 

In  our  computations  the  initial  data  is  255  times  the 
characteristic  function  of  a  disk  or  of  a  square,  and  we  set 
e(p")  =  1/4,  so  we  ensure  that  the  L^il)  error  between 
our  computed  discrete  minimizer  and  the  exact  discrete 
minimizer  is  <  1/4  greyscales. 

We  begin  by  discussing  the  efficiency  improvements  we 
observed  by  our  multiscale  method.  As  mentioned  in  the 
previous  paragraph,  we  do  not  compute  exact  discrete  min- 
imizers,  but  only  approximate  discrete  minimizers.  Our 
final  goal  is  to  compute  //*  such  that 

Wfh  —  /ft  IIl5(/)  <  Eft 

for  some  eft.  We  begin  the  iterative  algorithm  with  some 
initial  guess  for  p/.  Without  our  multiscale  algorithm  we 
take  =  0.  Our  multiscale  algorithm  says  that  we  should 
iteratively  compute  p^ft  ^  with  grid  size  2h  until 

II /2ft  -  /4IIl|'*(/)  <  e2ft 

and  then  set 


In  our  computations  we  take  €2h  =  Cft  =  1/4.  It  is  possible 
that,  given  eft  =  1/4,  a  better  choice  of  e2h  can  improve 
our  initial  vector  field  p°. 

We  compare  the  computational  effort  needed  by  our 
multiscale  method  and  the  iterative  method  operating 
solely  on  the  grid  with  grid  spacing  h  and  p°  =  0.  We 
note  that  each  iteration  p^ft  ^  P2h^  a  grid  with  mesh 
size  2h  takes  (roughly)  1  /4  as  many  operations  as  one  iter¬ 
ation  on  the  grid  of  size  h.  Thus,  if  the  number  of  iterations 
on  grids  with  spacing  h,  2h,  Ah,  etc.,  are  N^,  A^2ft,  -^4ft, 
respectively,  we  report  the  number  of  equivalent  iterations 
on  the  finest  grid, 

Nh  +  T-^2ft  +  +  •  •  •  . 

4  16 

In  our  examples  we  take  the  data  to  be  /  =  255X[i  2]2, 
the  characteristic  function  of  a  subsquare  with  sidelength 
1/2  inside  the  computational  domain  of  [0,1]^.  We  use 
Dirichlet  boundary  conditions.  We  computed  numerical 
solutions  on  grids  with  h  =  1/128,  1/256,  and  1/512.  We 
chose  three  values  of  A  for  which  the  L2{I)  distance  be¬ 
tween  the  solution  of  the  continuous  problem  (2)  and  the 
initial  data  /  is  16,  32,  and  64.  For  this  purpose  we  use  the 
characterization  of  the  exact  solutions  (see  [5] ,  Section  4  in 
[2],  or  Appendix  1  in  [1])  and  found  that  the  corresponding 
values  of  A  are  3.771636443,  7.820179629,  and  16.26268646, 
respectively. 

A  summary  of  the  iteration  count  to  solve  both  prob¬ 
lems  (6)  and  (22)  such  that  the  error  bound  (28)  is  <  1/4, 
both  with  and  without  the  multiscale  approach,  are  sum¬ 
marized  in  Tables  1  and  2.  The  iteration  count  with  the 
multiscale  approach  is  reported  as  the  equivalent  number 
of  iterations  at  the  finest  resolutions  as  described  above. 


Table  1 

Iteration  count  without  the  multiscale  algorithm; 
columns  1-3  are  the  result  of  (6);  columns  4-6  are  the  result  of  (22) 


128 

256 

512 

16 

4,815 

10,466 

36,653 

32 

21,772 

45,744 

103,817 

64 

119,468 

255,096 

514,060 

16 

4,293 

17,173 

68,324 

32 

5,414 

21,162 

83,908 

64 

13,049 

33,158 

113,843 

columns  1- 

Table  2 

Iteration  count  with  the  multiscale  algorithm; 

-3  are  the  result  of  (6);  columns  4-6  are  the  result  of  (22) 

16 

32 

64 

16 

32 

64 

128 

1,393 

2,358 

10,047 

1,694 

2,574 

3,476 

256 

4,525 

6,722 

12,250 

5,460 

8,851 

12,484 

512 

14,615 

22,328 

33,115 

17,197 

30,676 

44,289 

We  see  that  the  multiscale  method  of  choosing  p° 
speeds  up  the  computation,  and  greatly  so  in  some  cases. 

Next  we  discuss  the  observed  error  between  the  min¬ 
imizer  /  of  the  continuous  problem  (2)  and  the  approx¬ 
imate  minimizers  //*  (with  an  error  in  L^il)  of  <  0.25) 
of  the  two  discrete  problems  (6)  and  (22).  For  the  three 
problems  with 

(29)  /  =  255X[1,|]2, 

and  A  equal  to  3.771636443,  7.820179629,  and  16.26268646 
(corresponding  to  ||/  —  /||l2(/)  =  16,  32,  and  64,  respec¬ 
tively)  we  computed  numerical  approximation  to  the  exact 
solutions  on  a  grid  of  size  2048  x  2048;  the  value  of  the  nu¬ 
merical  approximation  on  the  subsquare  2c®  (^+*)  taken 
to  be 

A  simple  geometric  argument  shows  that  this  approxima¬ 
tion  is  a  near-best  piecewise  constant  projection  in  7^2(7) 
of  /  onto  a  2,048  x  2,048  grid  (just  follow  the  argument 
for  Example  2  in  Section  III.E  of  [15],  as  the  measure  of 
the  subgrid  square  where  /  is  less  than  the  value  (30)  is  no 
less  than  1/4,  and  no  greater  than  3/4,  times  the  measure 
of  the  subgrid  square). 

Table  3 

L2(/)  errors  on  grids  of  size  128,  256,  and  512,  and 
differences  \\f  —  fWL^il)  o/ 16,  32,  and  64,  with  initial  data  (29); 
columns  1-3  are  the  result  of  (6);  columns  4-6  are  the  result  of  (22); 
a  is  the  estimated  order  of  convergence,  \\f  —  fh\\L2(l)  ~  Ch°‘. 


16 

32 

64 

16 

32 

64 

128 

1.613 

1.889 

2.113 

1.533 

1.813 

2.045 

256 

0.962 

1.134 

1.249 

0.900 

1.041 

1.145 

512 

0.554 

0.654 

0.733 

0.508 

0.578 

0.639 

a 

0.772 

0.765 

0.764 

0.796 

0.824 

0.839 

We  compute  piecewise  constant  approximations  /^  of 
the  minimizers  of  (6)  and  (22)  on  grids  of  size  128  x  128, 
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Figure  1.  The  solution  of  the  discrete  BV  problem  using  the 
anisotropic  definition  (8). 


Figure  2.  The  solution  of  the  discrete  BV  problem  using  the 
upwind  definition  (17). 


Figure  3.  The  solution  of  the  discrete  BV  problem  using  the 
anisotropic  definition  (8)  in  ‘false  color”. 


Figure  4.  The  solution  of  the  discrete  BV  problem  using  the 
upwind  definition  (17)  in  ‘false  color”. 


256  X  256,  and  512  x  512,  and  measure  the  L2{I)  distance 
between  these  approximations  and  the  projection  onto 
a  2048  X  2048  grid  of  /  given  above.  These  differences  are 
reported  in  Table  3. 

We  also  computed  solutions  approximate  solutions 
when  /  is  255  times  the  characteristic  function  of  the  disk 

(31)  /  =  255X|,_, 


with  11/  —  /||l2(/)  =  16,  32,  64.  Again  we  used  Dirichlet 
boundary  conditions  and  the  discrete  minimization  prob¬ 
lems  (6)  and  (22);  again  we  ensured  that  e(p"')  <  1/4, 
so  we  know  that  \\fh  —  fh\\L2ii)  ^  1/4-  The  exact  solu¬ 
tion  /  is  simply  a  multiple  of  the  characteristic  function 
of  same  disk  such  that  ||/  —  f\\L2(i)  is  the  correct  value. 
The  values  of  ||/  —  /||l2(/)  =  16,  32,  and  64  correspond  to 
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Figure  5.  The  solution  of  the  discrete  BV  problem  using  the 
anisotropic  definition  (8)  (128x128). 


Figure  6.  The  solution  of  the  discrete  BV  problem  using  the 
upwind  definition  (17)  (128x128). 


Figure  7.  The  solution  of  the  discrete  BV  problem  using  the 
anisotropic  definition  (8)  in  “false  color”  (128x128). 


Figure  8.  The  solution  of  the  discrete  BV  problem  using  the 
upwind  definition  (17)  in  “false  color”  (128x128). 


A  -  4.5134516668,  9.02703337,  and  18.05406674,  respec¬ 
tively.  We  compared  the  same  piecewise  constant  approx- 
imation  (30)  to  /  on  a  2048  x  2048  grid  to  piecewise  con¬ 
stant  approximations  the  discrete  solutions  of  (6)  and 
(22)  on  grids  of  size  128  x  128,  256  x  256,  and  512  x  512. 
Here  the  piecewise  constant,  discrete  initial  data  fh  is  not 
equal  to  the  true  initial  data  /;  we  used  the  same  inter¬ 
polation  method  (30)  to  compute  our  discrete  data  fh  on 


grids  of  size  128  x  128,  etc.  The  results  are  reported  in 
Table  4. 

With  both  the  characteristic  function  of  the  disk  and 
the  square  as  data,  the  solution  is  in  the  Sobolev  space 
]4//3,2  Jq],  p  most  1/2,  so  one  might  suspect  that  the  max¬ 
imum  possible  rate  of  approximation  in  L2{I)  by  piecewise 
constants  is  0{h^),  and  this  is,  roughly,  what  one  observes 
when  the  data  is  a  multiple  of  the  characteristic  function 


of  a  disk.  When  the  data  is  a  multiple  of  the  characteris¬ 
tic  function  of  the  square,  however,  the  discontinuities  in 
the  solution  are  aligned  with  the  computational  grid,  and 
the  convergence  rate,  which  is  at  most  one  for  piecewise 
constant  approximations,  is  limited  by  the  smoothness  of 
/  inside  only  the  subsquare  [j,  |]^,  and  the  experimental 
convergence  rate  is  clearly  above  1/2. 

Next  we  discuss  the  qualitative  properties  of  the  nu¬ 
merical  solutions. 

The  anisotropy  of  the  operator  (7)  was  briefly  noted 
in  the  previous  section.  The  operator  (20)  was  offered  as 
a  “more  isotropic”  operator,  but  it,  too,  is  fundamentally 
anisotropic — if  a  function  has  discontinuities  across  curves 
that  are  not  vertical,  horizontal,  or  diagonal  lines  then 
indeed 

|6'Ibv'*(/)  ^  I5Ibv(/) 

even  for  the  “upwind”  discrete  BV  semi-norm.  Here  we 
briefly  give  two  examples  that  illustrate  the  effect  of  this 
anisotropy. 

As  usual,  we  work  only  with  Dirichlet  boundary  con¬ 
ditions.  We  begin  with 

f  =  255XD, 

where  D  is  the  disk  with  center  (|,  |)  and  radius  j.  Using 
the  iteration  (26),  we  then  find  the  approximate  minimizer 
over  all  discrete  g  with 

Wf-gh^ii)  <  64 

of 

llffllBV'‘(/) 

for  both  the  anisotropic  definition  (8)  and  the  “upwind” 
definition  (17)  of  the  discrete  BV  semi-norm.  The  exact 
minimizer  of  the  continuous  problem  in  this  case  is  known 
[22]  to  be  (255  —  Xd,  where  r  =  j  is  the  radius  of  the 
disk  and  A  (satisfying  A  >  4)  is  chosen  so  that 

||255Xd-  (^255  - XdUl.u)  =  64. 

We  use  the  error  bound  e(p”)  to  ensure  that  the  L^il) 
errors  are  no  greater  than  1/4. 

Table  4 

L2(/)  errors  on  grids  of  size  128,  256,  and  512,  and 
differences  ||/  —  fWL^ii)  with  initial  data  (31); 

columns  1-3  are  the  result  of  (6);  columns  4-6  are  the  result  of  (22) 
a  is  the  estimated  order  of  convergence,  \\f  —  fh\\L2(i)  ~  Ch°‘. 


16 

32 

64 

16 

32 

64 

128 

10.637 

9.223 

6.004 

9.925 

8.312 

5.143 

256 

7.929 

6.981 

4.542 

7.061 

6.051 

3.795 

512 

6.029 

5.360 

3.495 

5.185 

4.503 

2.852 

a 

0.410 

0.392 

0.390 

0.468 

0.442 

0.425 

In  our  experiments  we  set  h  =  1/128.  The  results  are 
shown  in  Figure  1  for  the  anisotropic  discrete  BV  norm  (8) 
and  in  Figure  2  for  the  upwind  discrete  BV  norm  (17).  If 
one  looks  closely,  one  sees  that  the  “northeast”  and  “south¬ 
west”  borders  of  the  solution  disk  in  the  anisotropic  solu¬ 
tion  are  more  smoothed  than  other  parts  of  the  border, 
and  the  “upwind”  solution  has  generally  a  sharper  border 
everywhere. 

To  illustrate  this  phenomenon  more  clearly,  we  include 
“false  color”  images  of  Figures  1  and  2  as  Figures  3  and  4, 
respectively.  Here  each  greyscale  was  assigned  an  arbitrary 
color  to  show  how  much  the  borders  are  smoothed  in  each 
of  the  solutions.  For  example,  the  pixels  with  a  greyscale 
of  0  were  colored  with  the  terra-cotta-like  color;  the  same 
mapping  of  greyscales  to  colors  was  used  in  both  images. 

Some  things  stick  out  immediately  from  the  false-color 
images.  First,  the  greyscale  values  of  the  central  plateau  of 
the  discrete  solutions  are  different,  but  their  difference  of 
only  one  greyscale  value  (113  in  the  anisotropic  image,  112 
in  the  greyscale  image)  could  be  explained  by  numerical 
error. 

Second,  the  smoothing  of  the  border  of  the  solution 
truly  is  anisotropic  in  the  “anisotropic”  image,  and  it  is 
smoothed  over  a  distance  of  about  9  pixels  in  the  northeast 
and  southwest  directions,  which  is  significant  (and  which 
cannot  be  due  to  the  discrete  error,  which  as  noted  above 
is  no  more  than  0.25  RMS  greyscales).  The  smoothing 
of  the  upwind  solution  is  spread  over  a  noticeably  smaller 
distance.  The  notion  that  “BV  preserves  edges”,  while  true 
in  the  continuous  setting,  clearly  needs  some  qualification 
in  the  discrete  setting. 

There  are  precisely  three  places  where  there  is  a  one- 
pixel  jump  from  the  plateau  of  the  disk  to  the  background 
color  in  the  anisotropic  image — at  the  right,  the  bottom, 
and  the  “northwest”  corner.  In  the  upwind  image,  there 
are  four  such  places,  at  the  left  and  right  and  top  and 
bottom  edges  of  the  plateau. 

Figures  5-8  show  similar  qualitative  effects  when  the 
initial  data  is  the  square  [q,  |]^- 

To  illustrate  the  damage  to  the  multiscale  algorithm 
by  (necessarily)  projecting  the  injected  vector  field 

t^hP2h 

onto  the  convex  set  we  illustrate 

(32)  -  AV„  •  =  lLif2h  -  XV2h  ■  P2h), 

which  is  just  the  injection  of  the  solution  at  a  scale  of  2h 
into  the  space  of  piecewise  constants  with  scale  h,  and 

(33)  lU2h-Xyn-{T^K,I^hP2h). 

which  is  close  to  the  initial  approximation  to  the  solution 
at  scale  h.  (It  would  be  precisely  the  initial  approximation 
at  scale  h  if  we  had  fh  instead  of  I2hf2h-)  We  take  h  = 
1  /128  and  the  anisotropic  method  (8).  Figure  9  shows  (32), 
which  is  the  64x  64  solution  injected  into  the  128  x  128  grid; 
Figure  10  shows  (33),  i.e.,  what  happens  when  we  project 
the  initial  vector  field  onto  Kh-  The  effect  is  so  severe 
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Figure  9.  The  image  (32)  with  h  =  1/128  and  (8)  with 
\\f-f\\L,(i)  =  e4.. 


Figure  10.  The  image  (33)  with  h  =  1/128  and  (8)  with 


because  the  operator  V h  magnifies  any  changes  incurred 
by  T^Kh  by  a  factor  of  l/h. 

These  figures  strongly  suggest  that  there  should  be  a 
— 

better  way  to  map  I^hP^h  onto  Kh-  For  example,  for  any 
divergence-free  vector  field  qh  (with  V/i  •  g/j  =  0),  we  could 
start  just  as  well  with 

(34)  Ph  =  '^Kn{'^hP2h-  qh) 


and  still  satisfy  (cf.  (27)) 

h  ■  {J^hP2h  —  Qh)  =  '  P2h- 

If  we  could  find  qh  so  that 

^hP2h  —  qh  ^  Kh, 

for  example,  then  using  (34)  as  the  initial  vector  field  would 
result  in  Figures  9  and  10  being  identical. 

We  now  consider  the  example 

/  =  255Xs 

where  S  is  the  square  [|:,  |]  x  [j,  |].  As  in  the  last  example, 
we  found  the  approximate  minimizer  of  IlffUsy  satisfying 
II  /  ~  sll  (/)  ^  64.  Allard  gives  the  exact  minimizer  of  the 
continuous  problem  in  an  appendix  to  [1]  (see  Figure  11). 


Figure  11.  The  projection  onto  a  256  X  256  grid  of  the  exact 
solution  of  the  continuous  BV  problem,  in  “false  color”. 


Both  the  original  and  upwind  solutions  preserve  sharp 
jumps  at  the  sides  of  the  squares,  however,  we  see  some 
significant  differences  near  the  corners.  As  expected,  the 
upwind  scheme  yields  the  same  type  of  smoothing  at  each 
of  the  four  corners,  which  seems  to  match  the  true  solu¬ 
tion  (see  Figure  11).  On  the  other  hand,  the  anisotropy  of 
the  original  scheme  is  evident  in  the  behavior  of  the  four 
corners.  It  is  also  curious  to  note  that  the  right  and  bot¬ 
tom  sides  of  the  square  are  much  better  preserved  than  the 
top  and  left  ones.  Furthermore,  the  truncated  top  left  cor¬ 
ner  appears  to  be  favoring  a  sharp  jump  over  what  should 
be  a  more  gradual  ramp.  Finally,  we  note  the  magnified 
anisotropy  of  the  ramps  in  the  northwest  and  southeast 
corners. 
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